Abstract

The relationship between the length of a word and the maximum length 
of its unbordered factors is investigated in this paper. Consider a finite word 
w of length n. We call a word bordered if it has a proper prefix which is also
a suffix of that word. Let μ(w) denote the maximum length of all unbordered
factors of w, and let d(w) denote the period of w. Clearly, μ(w) ≤ d(w).

We establish that μ(w) = d(w) if w has an unbordered prefix of length μ(w)
and n ≥ 2μ(w) − 1. This bound is tight and solves the stronger version
of a 21-year-old conjecture by Duval. It follows from this result that, in
general, n ≥ 3μ(w) − 2 implies μ(w) = d(w), which gives an improved bound
for the question asked by Ehrenfeucht and Silberger in 1979.

1 Introduction 

Periodicity and borderedness are two fundamental properties of words, and they
are the subject of this paper. Both play a role (explicitly or implicitly) in many
areas. Classical examples are string searching algorithms [?], data compression [23],
and codes; further areas where periodicity and borderedness of words (sequences)
are important concepts include computational biology, e.g., sequence assembly [19]
or superstrings [?], and serial data communications systems [5]. It is well known
that these two word properties do not exist independently of each other. However,
it is somewhat surprising that no clear relation has been established so far, despite
the fact that this basic question has been around for more than 20 years.

Let us consider a finite word (a sequence of letters) w. We denote the length
of w by |w| and call a subsequence of consecutive letters of a word a factor. The
period of w, denoted by d(w), is the smallest positive integer p such that the i-th
letter equals the (i + p)-th letter for all 1 ≤ i ≤ |w| − p. Let μ(w) denote the
maximum length of all unbordered factors of w. A word is bordered if it has
a proper prefix that is also a suffix, where we call a prefix proper if it is neither
empty nor the entire word. To investigate the relationship
between |w| and the maximality of μ(w), that is, μ(w) = d(w), we consider the
special case where the longest unbordered prefix of a word is of the maximum
length, that is, no unbordered factor is longer than that prefix. Let w be an
unbordered word. Then a word wu is a Duval extension (of w) if every unbordered
factor of wu has length at most |w|, that is, μ(wu) = |w|. We call wu a trivial Duval
extension if d(wu) = |w|, or in other words, if u is a prefix of w^k for some k ≥ 1.
For example, let w = abaabb and u = aaba. Then wu = abaabbaaba is a nontrivial
Duval extension of w since (i) w is unbordered, (ii) all factors of wu longer than
w are bordered, that is, |w| = μ(wu) = 6, and (iii) the period of wu is 7, and
hence, d(wu) > |w|. Note that this example satisfies |u| = |w| − 2.

In 1979 a line of research was initiated [?] exploring the relationship
between the length of a word w and μ(w). In 1982 these efforts culminated in the
following result by Duval: If |w| ≥ 4μ(w) − 6 then d(w) = μ(w). However, it was
conjectured that |w| ≥ 3μ(w) implies d(w) = μ(w), which follows if Duval's
conjecture [10] holds true.

Conjecture 1.1. Let wu be a nontrivial Duval extension of w. Then |u| < |w|.

After that, no progress was recorded, to the best of our knowledge, for 20 years.
However, the topic remained popular; see for example Chapter 8 in [17]. The most
recent results are by Mignosi and Zamboni [20] and the authors of this article [?].
However, those papers investigate not Duval's conjecture but rather its opposite,
that is: Which words admit only trivial Duval extensions? It is shown in
[20] that unbordered, finite factors of Sturmian words allow only trivial Duval
extensions; in other words, if an unbordered, finite factor of a Sturmian word
of length μ(w) is a prefix of w, then d(w) = μ(w). Sturmian words are binary
infinite words of minimal subword complexity, that is, a Sturmian word contains
exactly n + 1 different factors of length n for every n ≥ 1; see [23] or Chapter 2
in [17]. That result was later improved in [?] by showing that Lyndon words [18]
allow only trivial Duval extensions, together with the fact that every unbordered,
finite factor of a Sturmian word is a Lyndon word. A Lyndon word is a word that
is minimal among all its conjugates with respect to some lexicographic order,
where a word uv is a conjugate of vu.

The main result in this paper is a proof of the extended version of Conjecture 1.1.

Theorem 1.2. Let wu be a nontrivial Duval extension of w. Then |u| < |w| − 1.

The example mentioned above shows that this bound on the length of a nontrivial
Duval extension is tight. Theorem 1.2 implies the truth of Duval's conjecture,
as well as the following corollary (for any word w).

Corollary 1.3. If |w| ≥ 3μ(w) − 2, then d(w) = μ(w).

This corollary confirms the conjecture by Assous and Pouzet in [1] about
a question asked by Ehrenfeucht and Silberger in [11].

Our main result, Theorem 1.2, is presented in Section 4, which uses the notations
introduced in Section 2 and preliminary results from Section 3. We conclude
with Section 5.

2 Notations



In this section we introduce the notations of this paper. We refer to [?] for
more basic and general definitions.

We consider a finite alphabet A of letters. Let A* denote the monoid of all finite
words over A including the empty word, denoted by ε. Let w = w_(1) w_(2) ··· w_(n)
where w_(i) is a letter, for every 1 ≤ i ≤ n. We denote the length n of w by |w|.
An integer 1 ≤ p ≤ n is a period of w if w_(i) = w_(i+p) for all 1 ≤ i ≤ n − p.
The smallest period of w is called the minimum period (or simply the period)
of w, denoted by d(w). A nonempty word u is called a border of a word w if
w = uv = v'u for some suitable words v and v'. We call w bordered if it has
a border that is shorter than w; otherwise w is called unbordered. Note that
every bordered word w has a minimum border u such that w = uvu, where u is
unbordered. Let μ(w) denote the maximum length of unbordered factors of w.
Suppose w = uv; then u is called a prefix of w, denoted by u ≤_p w, and v is called
a suffix of w, denoted by v ≤_s w. Let u, v ≠ ε. Then we say that u overlaps v from
the left or from the right if there is a word w such that |w| < |u| + |v|, and u ≤_p w
and v ≤_s w, or v ≤_p w and u ≤_s w, respectively. We say that u overlaps (intersects)
with v if either v is a factor of u or u is a factor of v or u overlaps v from the left
or right.

Let us consider the following examples. Let A = {a, b} and u, v, w ∈ A* such
that u = abaa and v = baaba and w = abaaba. Then |w| = 6, and 3, 5, and 6 are
periods of w, and d(w) = 3. We have that a is the shortest border of u and w,
whereas ba is the shortest border of v. We have μ(w) = 3. We also have that u
and v overlap since u ≤_p w and v ≤_s w and |w| < |u| + |v|.
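
The definitions above can be checked by brute force. The following Python sketch computes borders, the period d(w), and μ(w) directly from the definitions and reproduces the values of the example. The function names (`borders`, `is_unbordered`, `period`, `mu`) are our own choice for illustration and do not come from the paper; the code assumes nonempty input words and makes no attempt at efficiency.

```python
def borders(w):
    """Return all borders of w that are shorter than w, shortest first."""
    return [w[:k] for k in range(1, len(w)) if w[:k] == w[-k:]]


def is_unbordered(w):
    """A nonempty word is unbordered if it has no border shorter than itself."""
    return len(w) > 0 and not borders(w)


def period(w):
    """Smallest p >= 1 with w[i] == w[i + p] wherever both letters exist."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[i] == w[i + p] for i in range(n - p)))


def mu(w):
    """Maximum length of an unbordered factor of w (exhaustive search)."""
    n = len(w)
    return max(j - i
               for i in range(n)
               for j in range(i + 1, n + 1)
               if is_unbordered(w[i:j]))
```

For w = abaaba this gives `period(w) == 3` and `mu(w) == 3`, and the first entries of `borders("abaa")` and `borders("baaba")` are the shortest borders a and ba named in the text.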

We continue with some more notations. Let w and u be nonempty words where
w is also unbordered. We call wu a Duval extension of w if every factor of wu
longer than |w| is bordered, that is, μ(wu) = |w|. A Duval extension wu of w is
called trivial if d(wu) = μ(wu) = |w|. A nontrivial Duval extension wu of w is
called minimal if u is of minimal length, that is, u = u'a and w = u'bw' where
a, b ∈ A and a ≠ b.

Example 2.1. Let w = abaabbabaababb and u = aaba. Then

w.u = abaabbabaababb.aaba

(for the sake of readability, we use a dot to mark where w ends) is a nontrivial
Duval extension of w of length |wu| = 18, where μ(wu) = |w| = 14 and d(wu) = 15.
However, wu is not a minimal Duval extension, whereas

w.u' = abaabbabaababb.aa

is minimal, with u' = aa ≤_p u. Note that wu is not the longest nontrivial Duval
extension of w since

w.v = abaabbabaababb.abaaba

is longer, with v = abaaba and |wv| = 20 and d(wv) = 17. One can check that wv
is a nontrivial Duval extension of w of maximum length, and at the same time wv
is also a minimal Duval extension of w.
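
The claims of Example 2.1 can be tested mechanically. The sketch below is a direct, exhaustive transcription of the definitions of Duval extension, trivial, and minimal as given above; all function names are hypothetical helpers introduced for illustration. In particular, `is_minimal` encodes only the stated condition u = u'a, w = u'bw' with a ≠ b, which is our reading of "u is of minimal length".

```python
def is_unbordered(x):
    """True if the nonempty word x has no border shorter than itself."""
    return len(x) > 0 and all(x[:k] != x[-k:] for k in range(1, len(x)))


def period(x):
    """Smallest p >= 1 such that x[i] == x[i + p] wherever both exist."""
    n = len(x)
    return next(p for p in range(1, n + 1)
                if all(x[i] == x[i + p] for i in range(n - p)))


def is_duval_extension(w, u):
    """w unbordered, u nonempty, every factor of wu longer than |w| bordered."""
    x = w + u
    if not u or not is_unbordered(w):
        return False
    return not any(is_unbordered(x[i:j])
                   for i in range(len(x))
                   for j in range(i + len(w) + 1, len(x) + 1))


def is_trivial(w, u):
    """A Duval extension wu is trivial if d(wu) = |w|."""
    return period(w + u) == len(w)


def is_minimal(w, u):
    """Nontrivial Duval extension with u = u'a and w = u'bw', a != b."""
    return (is_duval_extension(w, u) and not is_trivial(w, u)
            and w.startswith(u[:-1]) and not w.startswith(u))
```

With w = abaabbabaababb, the calls `is_duval_extension(w, "aaba")`, `period(w + "aaba")`, and `is_minimal(w, "aa")` correspond to the statements of the example; the same functions can be applied to v = abaaba.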

Let an integer p with 1 ≤ p < |w| be called a point in w. Intuitively, a point p
denotes the place between w_(p) and w_(p+1) in w. A nonempty word u is called
a repetition word at point p if w = xy with |x| = p and there exist x' and y' such
that u ≤_s x'x and u ≤_p yy'. For a point p in w, let

d(w, p) = min{ |u| : u is a repetition word at p }

denote the local period at point p in w. Note that the repetition word of length
d(w, p) at point p is necessarily unbordered and d(w, p) ≤ d(w). A factorization
w = uv, with u, v ≠ ε and |u| = p, is called critical if d(w, p) = d(w), and, if this
holds, then p is called a critical point.

Example 2.2. The word

w = ab.aa.b

has the period d(w) = 3 and two critical points, 2 and 4, marked by dots. The
shortest repetition words at the critical points are aab and baa, respectively. Note
that the shortest repetition words at the remaining points 1 and 3 are ba and a,
respectively.
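
Local periods are straightforward to compute from the definition. In the sketch below, a candidate length q is accepted at point p if the letters at distance q agree wherever both fall inside w around the point; this is equivalent to the existence of a repetition word of length q, since the parts sticking out of w can be filled by x' and y'. The names `local_period` and `critical_points` are ours, and the code is an illustration under this reading rather than an optimized routine.

```python
def period(w):
    """Smallest p >= 1 such that w[i] == w[i + p] wherever both exist."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[i] == w[i + p] for i in range(n - p)))


def local_period(w, p):
    """d(w, p) for a point 1 <= p < |w| (p letters lie left of the point)."""
    n = len(w)
    q = 1
    while True:
        # Compare w[i] with w[i + q] for the positions i in the window
        # [p - q, p - 1] whose partner i + q still lies inside w.
        if all(w[i] == w[i + q] for i in range(max(0, p - q), min(p, n - q))):
            return q
        q += 1  # terminates at the latest when the window becomes empty


def critical_points(w):
    """All points p with d(w, p) = d(w)."""
    d = period(w)
    return [p for p in range(1, len(w)) if local_period(w, p) == d]
```

For w = abaab the local periods at the points 1, 2, 3, 4 are 2, 3, 1, 3, which matches the lengths of the repetition words ba, aab, a, baa of Example 2.2.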

3 Preliminary Results 

We state some auxiliary and well-known results about repetitions and borders in
this section which will be used to prove Theorem 1.2 in Section 4.

Lemma 3.1. Let zf = gzh where f, g ≠ ε. Let az' be the maximum unbordered
prefix of az. If az does not occur in zf, then agz' is unbordered.

Proof. Assume agz' is bordered, and let y be its shortest border. In particular, y
is unbordered. If |z'| ≥ |y| then y is a border of az' which is a contradiction. If
|az'| = |y| or |az| < |y| then az occurs in zf which is again a contradiction. If
|az'| < |y| ≤ |az| then az' is not maximum since y is unbordered; a contradiction.

□ 

The proof of the following lemma is easy. 

Lemma 3.2. Let w be an unbordered word and u ≤_p w and v ≤_s w. Then uw and
wv are unbordered.

The critical factorization theorem (CFT) is one of the main results about periodicity
of words. A weak version of it was first conjectured by Schützenberger [22] and proved
by Césari and Vincent [?]. It was developed into its current form by Duval.
We refer to [?] for a short proof of the CFT.

Theorem 3.3 (CFT). Every word w, with |w| ≥ 2, has at least one critical
factorization w = uv, with u, v ≠ ε and |u| < d(w), i.e., d(w, |u|) = d(w).
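
As a sanity check, the theorem can be verified exhaustively on short words. The sketch below enumerates all words up to a given length over a small alphabet and looks for a critical point p with p < d(w). One assumption of ours is labeled in the code: for words with d(w) = 1 (powers of a single letter) no point satisfies p < d(w), so the sketch accepts p = 1 there, i.e., it tests p ≤ max(1, d(w) − 1). All function names are hypothetical.

```python
from itertools import product


def period(w):
    """Smallest p >= 1 such that w[i] == w[i + p] wherever both exist."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[i] == w[i + p] for i in range(n - p)))


def local_period(w, p):
    """d(w, p) for a point 1 <= p < |w|."""
    n = len(w)
    q = 1
    while not all(w[i] == w[i + q]
                  for i in range(max(0, p - q), min(p, n - q))):
        q += 1
    return q


def has_early_critical_point(w):
    """Is there a critical point p with p <= max(1, d(w) - 1)?

    Assumption: the case d(w) = 1 is read as allowing p = 1.
    """
    d = period(w)
    return any(local_period(w, p) == d
               for p in range(1, min(len(w), max(2, d))))


def check_all(alphabet="ab", max_len=8):
    """Check the statement for every word of length 2..max_len."""
    return all(has_early_critical_point("".join(t))
               for n in range(2, max_len + 1)
               for t in product(alphabet, repeat=n))
```

Calling `check_all("ab", 8)` inspects 508 binary words; a larger alphabet or length bound only changes the running time.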

We have the following two lemmas about properties of critical factorizations.

Lemma 3.4. Let w = uv be unbordered and |u| be a critical point of w. Then u
and v do not overlap.

Proof. Note that d(w, |u|) = d(w) = |w| since w is unbordered. Let |u| ≤ |v|
without restriction of generality. Assume that u and v overlap. If u = u's and
v = sv', then d(w, |u|) ≤ |s| < |w|. On the other hand, if u = su' and v = v's,
then w is bordered with s. Finally, if v = sut then d(w, |u|) ≤ |su| < |w|. □

The next result follows directly from Lemma 3.4.

Lemma 3.5. Let u_0 u_1 be unbordered and |u_0| be a critical point of u_0 u_1. Then for
any word x, we have u_i x u_{i+1}, where the indices are modulo 2, is either unbordered
or has a minimum border g such that |g| ≥ |u_0| + |u_1|.

The next theorem states a basic fact about minimal Duval extensions. See [?]
for a proof of it.

Theorem 3.6. Let wu be a minimal Duval extension of w. Then u occurs in w. 

The following Lemmas 3.7, 3.8 and 3.9 and Corollary 3.10 are given in [?]. Let
a_0, a_1 ∈ A, with a_0 ≠ a_1, and t_0 ∈ A*. Let the sequences (a_i), (s_i), (s'_i), (s''_i), and
(t_i), for i ≥ 1, be defined by

• a_i = a_(i mod 2), that is, a_i = a_0 or a_i = a_1, if i is even or odd, respectively,

• s_i such that a_i s_i is the shortest border of a_i t_{i−1},

• s'_i such that a_{i+1} s'_i is the longest unbordered prefix of a_{i+1} s_i,

• s''_i such that s'_i s''_i = s_i,

• t_i such that t_i s''_i = t_{i−1}.

For any parameters of the above definition, the following holds.
Lemma 3.7. For any a_0, a_1, and t_0 there exists an m ≥ 1 such that

|s_1| < ··· < |s_m| = |t_{m−1}| ≤ ··· ≤ |t_0|

and s_m = t_{m−1} and |t_0| ≤ |s_m| + |s_{m−1}|.
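
To make the recursive definition of the sequences concrete, the following Python sketch computes s_1, ..., s_m and t_0, ..., t_{m−1} for given letters a_0, a_1 and a word t_0. It rests on two assumptions of ours that should be read as such: the "shortest border" of a_i t_{i−1} is taken to be the whole word when no shorter border exists (this is what makes s_m = t_{m−1} possible), and the loop relies on Lemma 3.7 for termination. The function names are hypothetical.

```python
def shortest_border(x):
    """Shortest nonempty border of x; x itself counts (assumption)."""
    return next(x[:k] for k in range(1, len(x) + 1) if x[:k] == x[-k:])


def longest_unbordered_prefix(x):
    """Longest prefix of x with no border shorter than itself."""
    return next(x[:k] for k in range(len(x), 0, -1)
                if all(x[:j] != x[k - j:k] for j in range(1, k)))


def sequences(a0, a1, t0):
    """Return ([s_1, ..., s_m], [t_0, ..., t_{m-1}]), stopping at s_m = t_{m-1}."""
    a = (a0, a1)                      # a_i = a[i % 2]
    s, t = [], [t0]
    i = 1
    while True:
        s_i = shortest_border(a[i % 2] + t[-1])[1:]    # a_i s_i is that border
        s.append(s_i)
        if s_i == t[-1]:                               # s_m = t_{m-1}: done
            return s, t
        s1 = longest_unbordered_prefix(a[(i + 1) % 2] + s_i)[1:]   # s'_i
        s2 = s_i[len(s1):]                             # s''_i, s'_i s''_i = s_i
        t.append(t[-1][:len(t[-1]) - len(s2)])         # t_i s''_i = t_{i-1}
        i += 1
```

For a_0 = a, a_1 = b, and t_0 = bab the sketch yields s = (ε, b, ba) and t = (bab, bab, ba), so m = 3, the lengths |s_i| increase strictly, and |t_0| = 3 = |s_3| + |s_2|, in line with Lemma 3.7.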

Lemma 3.8. Let z ≤_p t_0 such that a_0 z and a_1 z do not occur in t_0. Let a_0 z_0 and
a_1 z_1 be the longest unbordered prefixes of a_0 z and a_1 z, respectively. Then

1. if m = 1 then a_0 t_0 is unbordered,

2. if m > 1 is odd, then a_1 s_m is unbordered and |t_0| ≤ |s_m| + |z_0|,

3. if m > 1 is even, then a_0 s_m is unbordered and |t_0| ≤ |s_m| + |z_1|.

Lemma 3.9. Let v be an unbordered factor of w of length μ(w). If v occurs twice
in w, then μ(w) = d(w).

Corollary 3.10. Let wu be a Duval extension of w. If w occurs twice in wu, then 
wu is a trivial Duval extension. 

4 Main Result



The extended Duval conjecture is proven in this section. 

Theorem 1.2. Let wu be a nontrivial Duval extension of w. Then |u| < |w| − 1.

Proof. Recall that every factor of wu which is longer than |w| is bordered since
wu is a Duval extension of w. Let z be the longest suffix of w that occurs twice
in zu.

If z = ε then a ≤_s w and u = b^j, where a, b ∈ A and a ≠ b and j ≥ 1, but now
|u| < |w| since ab^j is unbordered. Moreover, w = b^k aw'a with k < j, otherwise wu
is a trivial Duval extension, and either aw'ab^j is bordered, in this case it follows
j < |w'|, or aw'ab^j is unbordered. In both cases it follows |u| < |w| − 1.

So, assume z ≠ ε. We have z ≠ w since wu is otherwise trivial by Corollary 3.10.
Let a, b ∈ A be such that

w = w'az and u = u'bzr

and z occurs in zr only once, that is, bz matches the rightmost occurrence of z in u.
Note that bz does not overlap az from the right, by Lemma 3.2, and therefore u'
exists, although it might be empty. Naturally, a ≠ b by the maximality of z, and
w' ≠ ε, otherwise azu'bz ≤_p wu has either no border or w is bordered (if azu'bz
has a border not longer than z) or az occurs in zu (if azu'bz has a border longer
than z); a contradiction in any case.

Let az_0 and bz_1 denote the longest unbordered prefix of az and bz, respectively.
Let a_0 = a and a_1 = b and t_0 = zr and the integer m be defined as in Lemma 3.8.
We have then a word s_m, with its properties defined by Lemma 3.8, such that

t_0 = s_m t'.

Consider azu'bz_0. We have that az and azu'bz_0 are both prefixes of azu, and bz_0
is a suffix of azu'bz_0 and az does not occur in zu'bz_0. It follows from Lemma 3.1
that azu'bz_0 is unbordered, and hence,

|azu'bz_0| ≤ |w|. (1)



Case: Suppose that m is even. Then we have 2 ≤ m and as_m (= a_m s_m) is
unbordered and |t_0| ≤ |s_m| + |z_1| by Lemma 3.8.

Suppose |t_0| = |s_m| + |z_1| and z_1 = z. Then |s_{m−1}| = |z| by Lemma 3.7. Note
that s_i ≤_p t_{i−1} ≤_p t_0 for all 1 ≤ i ≤ m, and hence, it follows that s_i ≤_p z for all
1 ≤ i < m. In particular, s_{m−1} = z. We have that bz (= a_1 s_{m−1}) is a border of
bt_{m−2} (= a_1 t_{m−2}). But now, bz occurs in t_0, and hence, in u, since t_i ≤_p t_0, for all
0 ≤ i < m, which is a contradiction.

So, assume that |t_0| < |s_m| + |z_1| or |z_1| < |z|. Suppose |s_m| ≤ |z_0|. Then
|azu'bz_0| ≤ |w| and

|u| = |azu| − |z| − 1
    = |azu'bz_0| − |z_0| + |t_0| − |z| − 1
    < |azu'bz_0| − |z_0| + |s_m| + |z_1| − |z| − 1
    ≤ |w| + |z_1| − |z| − 1
    ≤ |w| − 1

if |t_0| < |s_m| + |z_1|, or

|u| = |azu| − |z| − 1
    = |azu'bz_0| − |z_0| + |t_0| − |z| − 1
    ≤ |azu'bz_0| − |z_0| + |s_m| + |z_1| − |z| − 1
    ≤ |w| + |z_1| − |z| − 1
    < |w| − 1

if |z_1| < |z|. We have |u| < |w| − 1 in both cases.

Let then |s_m| > |z_0|. We have that as_m is unbordered, and since az_0 is the
longest unbordered prefix of az, we have az ≤_p as_m, and hence, |z| ≤ |s_m|. Now,
azu'bs_m is unbordered otherwise its shortest border is longer than az, since no
prefix of az is a suffix of as_m, and az occurs in u; a contradiction. So, |azu'bs_m| ≤ |w|
and |u| < |w| − 1, since either |z_1| < |z| or |t_0| < |s_m| + |z_1|.

Case: Suppose that m is odd. Then bs_m (= a_m s_m) is an unbordered word
and |t_0| ≤ |s_m| + |z_0|; see Lemma 3.8. Surely s_m ≠ ε.

If |s_m| < |z|, then |u| < |w| − 1 since

|u| = |azu'bz_0| − |bz_0| + |bt_0| − |az|

and |azu'bz_0| ≤ |w|, by (1), and |t_0| ≤ |s_m| + |z_0|.

Assume thus that |s_m| ≥ |z|, and hence, also z ≤_p s_m. Since s_m ≠ ε, we
have |bs_m| ≥ 2, and therefore, by the critical factorization theorem, there exists
a critical point p in bs_m such that bs_m = v_0 v_1, where |v_0| = p.

In particular,

bz ≤_p v_0 v_1. (2)

Note that if s_m = z then |z_0| < |z| since b ≤_s z_0 and bs_m does not end with b
because it is unbordered. We have therefore in all cases

|z_0| < |v_0 v_1| − 1. (3)

Let

u = u'_0 v_0 v_1 u_1

be such that v_0 v_1 does not occur in u'_0. Note that v_0 v_1 does not overlap with
itself since it is unbordered, and v_0 and v_1 do not overlap by Lemma 3.4. Consider
the prefix w u'_0 bz of wu, which is bordered and has a shortest border g longer than
z, and hence, bz ≤_s g, otherwise w is bordered since z ≤_s w. Moreover, g ≤_p w, for
otherwise az would occur in u, and hence, bz occurs in w. Let

w = w_0 bz w_1

such that bz occurs in w_0 bz only once, that is, we consider the leftmost occurrence
of bz in w. Note that

|w_0 bz| ≤ |g| < |u'_0 bz| (4)

where the first inequality comes from the definition of w_0 above and the second
inequality from the fact that |u'_0 bz| ≤ |g| implies that w is bordered. Let

f = bz w_1 u'_0 v_0 v_1.

If f is unbordered, then |f| ≤ |w|, and hence, |u'_0 v_0 v_1| ≤ |w_0|. Now, we have
|u'_0| < |w_0| which contradicts (4).

Therefore, f is bordered. Let h be its shortest border.

Surely, |bz| < |h| otherwise v_0 v_1 is bordered by (2). So, bz ≤_p h. Moreover,
|v_0 v_1| < |h| otherwise bz occurs in s_m contradicting our assumption that bzr marks
the rightmost occurrence of bz in u. So, v_0 v_1 ≤_s h, and v_0 v_1 occurs in w since
w_0 h ≤_p w by (4). Let

w_0 bz v' = w_0 h = w'_0 v_0 v_1.

Note that v_0 v_1 does not occur in w'_0 otherwise it occurs in u'_0 contradicting our
assumption on u'_0. Moreover, we have h = bzv' ≤_s u'_0 v_0 v_1. Let u'_0 v_0 v_1 = u_0 h.
Consider

f_0 = w u_0 bz

which has a shortest border h_0.

Surely, bz ≤_s h_0 otherwise w is bordered with a suffix of z. Moreover, |w_0 bz| ≤ |h_0|
and |h_0| ≤ |w_0 bz| since bz does not occur in w_0 and w is unbordered. From that
and w_0 h = w'_0 v_0 v_1 and u_0 h = u'_0 v_0 v_1 follows now |w'_0| ≤ |u'_0| and

u'_0 v_0 v_1 = u_0 bz v' and w_0 occurs in u_0. (5)

Let now

w = w'_0 v_0 v_1 w'_i ··· v_0 v_1 w'_2 v_0 v_1 w'_1 v_0 v_1 w_2

for some word w_2 that does not contain v_0 v_1, and

u = u'_0 v_0 v_1 u'_j ··· v_0 v_1 u'_2 v_0 v_1 u'_1 v_0 v_1 t'

such that v_0 v_1 does not occur in w'_k, for all 0 ≤ k ≤ i, or u'_ℓ, for all 0 ≤ ℓ ≤ j. Note
that these factorizations of w and u are unique, and, moreover, w_2 ≠ ε. (Indeed,
if w_2 = ε then v_0 v_1 ≤_s w and az ≤_s v_0 v_1, and az would occur in u; a contradiction.)

We claim that either i = j and w'_k = u'_k, for all 1 ≤ k ≤ i, or |u| < |w| − 1.

Assume k = 1. We show that w'_1 = u'_1. Consider

f_1 = v_1 w'_1 v_0 v_1 w_2 u'_0 v_0 v_1 u'_j ··· v_0 v_1 u'_1 v_0.

If f_1 is unbordered, then |u| < |w| − 1 since |f_1| ≤ |w| and

|u| = |f_1| − |v_1 w'_1 v_0 v_1 w_2| + |v_1 t'|

and |t'| ≤ |z_0| ≤ |z| < |bz| ≤ |v_0 v_1| and w_2 ≠ ε. Assume then that f_1 is bordered,
and let h_1 be its shortest border. Clearly, h_1 = v_1 g_1 v_0 for some g_1 (possibly
g_1 = ε) since v_0 and v_1 do not overlap. We show that h_1 ≤_p v_1 w'_1 v_0. Indeed,
otherwise either

1. az occurs in u, in case wiw^ofiW2 < a contradiction to our assumption 
on az, or 

2. wo and t>i overlap, in case |t>o| < M and 

Iwiwiwofi^l - |az| + |i! | < \hi\ < \viw[voViW2\ 
and then vq occurs in z, contradicting Lemma 13.41 or 



3. |u| < I w| — 1, in case V0W3 =4 W2 and \az\ < \voWs\, then vqw^u'vqVi is 
unbordered and the result follows from \t'\ < \vqWs\ — 1, since \az\ ^ \vqw^\ 



for vq does not begin with a. 






Moreover, h_1 ≤_s v_1 u'_1 v_0 since v_0 v_1 does not occur in v_1 w'_1 v_0. So, let

w'_1 v_0 = g_1 v_0 w''_1 and v_1 u'_1 = u''_1 v_1 g_1. (6)
! ft 1 

Consider

f_2 = v_0 w''_1 v_1 w_2 u'_0 v_0 v_1 u'_j ··· v_0 v_1 u'_1 v_0 v_1.

If f_2 is unbordered, then |u| < |w| − 1 since |f_2| ≤ |w| and

|u| = |f_2| − |v_0 w''_1 v_1 w_2| + |t'|

and |t'| ≤ |z_0| ≤ |z| < |bz| ≤ |v_0 v_1| and w_2 ≠ ε. Assume then that f_2 is bordered,
and let h_2 be its shortest border. Since v_0 and v_1 do not overlap, v_0 v_1 ≤_s h_2.
Also h_2 ≤_p v_0 w''_1 v_1 since v_0 v_1 does not occur in w_2 (and v_0 and v_1 do not overlap)
and az does not occur in h_2 (and so h_2 does not stretch beyond w). We have
v_0 w''_1 v_1 ≤_p h_2 since v_0 v_1 does not occur in v_0 w''_1 v_1 unless w''_1 = ε. Hence, we have
h_2 = v_0 w''_1 v_1 and

w'_1 v_0 v_1 = g_1 h_2 and h_2 ≤_s u'_1 v_0 v_1. (7)




Consider

f_3 = v_0 v_1 w'_1 v_0 v_1 w_2 u'_0 v_0 v_1 u'_j ··· v_0 v_1 u'_2 v_0 u''_1 v_1.

If f_3 is unbordered, then |u| < |w| − 1 since |f_3| ≤ |w| and

|u| = |f_3| − |v_0 v_1 w'_1 v_0 v_1 w_2| + |g_1 v_0 v_1 t'|

and |t'| ≤ |z_0| ≤ |z| < |bz| ≤ |v_0 v_1| and |g_1| ≤ |w'_1| and w_2 ≠ ε. Assume that
f_3 is bordered. Then f_3 has a shortest border h_3 such that v_0 v_1 ≤_p h_3. We have
h_3 = v_0 u''_1 v_1 by the arguments from the previous paragraph. Moreover,

v_0 v_1 u'_1 = h_3 g_1 and h_3 ≤_p v_0 v_1 w'_1. (8)

Observe, that J7J and (JSJ) imply that the number of occurrences of v\ and vq, 
respectively, is the same in w[ and u[ since and v\ do not overlap. Now, let 

h\ = wisi^o = h'lvih'-yVo = vih' v h' ' 

where v\ and vq occur only once in vih^ and h' Vo, respectively. 

Now, let 


f 2 = v how"viW2U v ViUj 


■ ■ ■ VqViU^VqVi 




and 


f 3 = v viw' 1 voViw 2 u' v viu' j ■ 


■ VoViu 2 vou'{h'{vi 





with the respective shortest borders h' 2 and h' 3 (which are both not empty, if 
M > \ w \ — 1; as in the case of f 2 and 73) and vqVi =^ h' 2 and vqVi < h' 3 . 

We have h' 2 < vohQw'{vi since vqVi does not occur in w 2 and az does not occur 
in h' 2 (and so h' 2 does not stretch beyond w). We have Voh f Qw'{vi < h 2 since vqv± 
does not occur in w[. Hence, we have h' 2 = Voh^WiVi an< ^ 



wi^o^i = h voh 2 w 1 vi — h' h 2 and ft. 2 w^ofi ■ 




We have h' 3 = VQu'{h'{vi by the arguments from the previous paragraph. More- 



over, 



V0V1U1 = vou'lh'lvih^ — h 3 h[ and voviw[ < h' 3 

It is now straightforward to see that 



w'l = u'{ = e 



for otherwise v± and vo occur more than once in v\hl x and h va, respectively. 
From |JB} follows now 

w[ = gi = u[ . 

Assume 1 < k ≤ min{i, j} and w'_ℓ = u'_ℓ, for all 1 ≤ ℓ < k. Let us denote both
w'_ℓ and u'_ℓ by v'_ℓ, for all 1 ≤ ℓ < k.
We show that w'_k = u'_k. Consider

U = V 1 w' k V Viv' k ^ l VQV 1 ■ ■ ■ v' 1 V V 1 W 2 u' V V 1 u' J ■ ■ ■ VqViU^Vq . 

If f 4 is unbordered, then |u| < \w\ — 1 since I/4I < \w\ and 

M = |.A| - \viw' k v v 1 v k _ 1 v v 1 ■ ■ ■ v'^qv^I + \viv k _ 1 v v 1 ■ ■ ■ v'^ov^'l 

and \t'\ < \zo\ < \z\ < \bz\ < \vqVi\ and w 2 7^ £■ Assume, f 4 is bordered. Then f 4 
has a shortest border h 4 such that \vqV\ \ < \h 4 \. Let h 4 = vig 4 v$. 
If Iwiw^wol < \h 4 \ then there exists an £ < k such that 

h 4 = viw^vov-iv'^vqVi ■ ■ ■ v' e+1 v viv"v 

where v" < v' e . That implies 

u'k = 4 

since v$vi does neither occur in nor in u' k . Now, consider 

h = viw^vqViv'^vqVi ■ ■ ■ v' 1 v viw 2 u' Q v viu' j ■ ■ ■ vqViu^vqViv'^vqVi ■ ■ ■ v"v . 

If /5 is unbordered, then |u| < \w\ — 1 since I/4I < |/s|, see above. Assume, f§ is 
bordered. Then / 5 has a shortest border h 5 such that 

\h 4 \ < \h 5 \ 

for otherwise h 4 is not the shortest border of f 4 , since either h 4 < /15 or h 5 < h 4 , 
and the latter implies that h 4 is bordered, and hence, not minimal. But now, we 
have a £' < £ such that 

h 5 = viw^vqViv'^voVi ■ ■ ■ v' ll+1 VoViv'l, v 

where v", < v' t . We have I/4I < I/5I < where 

fe = -uiWfeWoviVfe-i^i • • ■ v[vov 1 w 2 u' v viUj ■ • • i^iUfc«owiVfc-i w o«i ■ • ■ v'l,v , 

which is either unbordered and |u| < \w\ — 1 since I/4I < I/5I, or it is bordered 
with a shortest border he, and we have \h 4 \ < \h 5 \ < |/i 6 | and a factor fr, such 
that I/4 1 < I/5 1 < I fe\ < I/7 1, and so on, until eventually an unbordered factor is 
reached proving that |w| < \w\ — 1. 

Assume then that h 4 < viiv' k vo- We also have that h 4 =4 viu' k v since v vi 
does not occur in w k . So, let w' k v = g 4 v w k and v\v! k = u' k 'vig 4 . 

Consider, 

f S = VoWkViv'^VoVi ■ ■ ■ v' 1 vov 1 W2u' v viu' j v v 1 ■ ■ ■ u' k v vi . 
If fs is unbordered, then |u| < \w\ — 1 since |/ 8 | < \w\ and 

M = l/sl - \vow'kVi v 'k-i v oVi ■ ■ ■ v'^qv^I + \v' k _ 1 v v 1 ■ ■ ■ v'^qv^'I 
and |t'| ≤ |z_0| ≤ |z| < |bz| ≤ |v_0 v_1| and w_2 ≠ ε. Assume f_8 is bordered. Then f_8
has a shortest border h_8 such that v_0 v_1 ≤_s h_8.
lf \h$\ > \vqw^.vi\ then the same argument as in the case |viw^o| < \h^\ above 
shows that \u\ < \w\ — 1. If \hg\ < |i>ou4'i;i| then vqV\ occurs in w' k ; a contradiction. 
Hence, we have hg, = Vqw'^Vi and 

w' k v vi = gih s and hg =4 u' k v vi . (9) 

Consider, 

fg = VQViw'kVQVxv'k^vaVi ■ ■ ■ v' l v a viw 2 u' v a viu' j v vi ■ ■ ■ u' k+1 Vgu'lvi . 
If fg is unbordered, then \u\ < \w\ — 1 since |/g| < \w\ and 

M = 1/9 I - K«lWfe«OVlUfc_ 1 Uo'Ul • • • v[vqViW 2 \ + IgiVQViv'^iVQVi ■ ■ ■ v[v vit'\ 

and \t'\ < \z \ < \z\ < \bz\ < \v vi\ and | (74 1 < \w k \ and w 2 7^ e. Assume, fg 
is bordered. Then fg has a shortest border hg such that vqVi < hg. We have 
hg = v u k vi by the arguments from the previous paragraph. Moreover, 

v viu' k = h g gx and hg < v viw' k . (10) 

Observe, that 10 and (|10|l imply that the number of occurrences of v\ and vq, 
respectively, is the same in w' k and u' k since vg and v 1 do not overlap. Now, let 

hi = vig 4 v = h'lvih^vg = vih' v h' ' 

where v\ and vg occur only once in vih[ and h' vg, respectively. 
Now, let 

/g = Wo/lo^fc^l^fe-l ' ' ' V V 1 v' 1 V V 1 W 2 .u' VgV 1 u' J ■ ■ ■ VgVxU^VgV! 

and 

fg = VgViw'f.VgViv'^ ■ ■ ■ V Q V 1 v' 1 V ViW 2 .u' V Q V 1 u' j ■ ■ ■ VgViU^Vgu'lh'lvi 

with the respective shortest borders h' s and h g (which are both not empty, if 
M > \ w \ — 1; as in the case of fg and fg). Analogously to the cases of fg and fg, 
we have 

w' k vgvi = h' h' 8 and vgviu' k — h' 9 h\ . 
It is now straightforward to see that 

K = K= VgVi 

and 

h 4 = Vgw'f.Vx = VgU k Vi 

and hence, w' k = u' k . In this case, we denote both w k and u' k by v' k . 
Now, we have

v — v viw[ ■ ■ ■ VoViw' 2 voViw' 1 

where i = min{i, j}. 
If i < j then 

\ w 'o\ < lu'QVovm'j ■ ■ -v viu' i+1 \ (11) 
since \w' \ < \u' \ by ©. Let 

fn = viw 2 u' Q VQViu' j ■ ■ ■ v Viu' i+1 vv . 

Then \w\ < |/n| by (JTIJ, and hence, /n is bordered. Let hu = vignvo be the 
shortest border of / n . Recall, that w 2 ^ £ and either az ^ v\w 2 or Viw 2 =^ az. If 
IU1W2I < \az\ then v\ necessarily occurs in z, and hence, it overlaps with v (since 
bz < v vi); a contradiction. So, we have az =4 viw 2 . Surely, \hn\ < 1^1.^2 1 (and so 
hn < v\w 2 ) for otherwise az occurs in u which contradicts our assumption that z 
is of maximum length. Let w 2 = gnvow^. Note, that l^o^l 7^ \&z\ since az and vq 
begin with different letters. We have \az\ < \vow$\ since otherwise vq occurs in z, 
and hence, overlaps with v\ which is a contradiction. Consider, 

fi2 = vowsu'nvovxu'j ■ ■ ■ v Viu' i+1 vv Vi . 

If /12 is unbordered, then \u\ < \w\ — 1 since I/12I < \w\ and 

M = I/12I - \v w 5 \ + \t'\ 

and \az\ < {vqW^I and \t'\ < |zo| < \z\ < \bz\ < \vqw^\. Assume, fi 2 is bordered. 
Then fi 2 has a shortest border h\ 2 = g\2VQV\ with \az\ < \hi 2 \, for otherwise az 
occurs in u. Let vo^s = 9i2VqViWq. But, now 

w = w' a vv vig 12 v Q viWa 

where vqV\Wq =<! w 2 , contradicting our assumption that VqV\ does not occur in w 2 . 
If i > j then 

w — w'qVqViwI ■ ■ ■ v Q viw' : j +1 vvQViw 2 and u — u' vvoVit' 

and \w\ > \u\ — \t'\ + \vqVi\. We have \u\ < \w\ — 1 since \t'\ < \z \ < \v vi\ — 1 
by ©. 

Assume i = j. Then 

w — w' vvoViw 2 and u = u' Q vvoVit' . 

Consider 

If /' is bordered, then it has a shortest border h! = vig'vQ. 

Recall, that W2 ^ e and either az ^ V1W2 or v±W2 =4 az. If |viu>2 < \az\ then 
vi occurs in z, and hence, overlaps with vq since bz < vqVi; a contradiction. So, 
we have az =<! viW2- Surely, \h'\ < \viW2\ for otherwise az occurs in u which 
contradicts our assumption. Let W2 = c/vqW/i. Note, that |^oW4| ^ \az\ since 
az and vq begin with different letters. We have \az\ < \vqW41 since otherwise 
occurs in z, and hence, overlaps with vi which is a contradiction. Consider now, 

/" = V Q W4u' VV Vi . 

If /" is unbordered, then it easily follows that |u| < \w\ — 1 since we have \t'\ < \az\ 
and \az\ < \vqW4\. 






If /" is bordered, then it has a shortest border h" = g"vovi with \az\ < \h"\, for 
otherwise az occurs in u. Let i>qW4 = g"voViw§. But, now 



w = w vv vig g v viw 5 

which contradicts our assumption that w = w' vvqViW2 and vqVi does not occur 
in W2- 

If /' is unbordered, then |/'| < \w\, and hence, \w' \ > \u' \. But, we also have 



0q| < |ttg|; see ©. That implies 



Moreover, the factors wq and bzv' 



have both nonoverlaping occurrences in u'qVqVi by ©. Therefore, 



u . Now, 



xaw7 



and 



= xbt" 



where w'qVVqVi < x and a, b G A and a ^ b and w-? =^ w 2 and t" =<! if . We have 
that xb occurs in w by Theorem 3.6. Since xb is not a prefix of w and v_0 v_1 does
not overlap with itself, we have |xb| + |v_0 v_1| ≤ |w|. From |t'| ≤ |z_0| < |v_0 v_1| − 1
we get |u| < |w| − 1 and the claim follows. □

Note that the bound |u| < |w| − 1 on the length of a nontrivial Duval extension
wu of w is tight, as the following example shows.

Example 4.1. Let w = a^n b a^{n+m} bb and u = a^{n+m} b a^n with n, m ≥ 1. Then

w.u = a^n b a^{n+m} bb.a^{n+m} b a^n

is a nontrivial Duval extension of w and |u| = |w| − 2.
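
Instances of Example 4.1 can be generated and tested by brute force. The sketch below builds w and u for given n and m and checks the Duval extension property exhaustively; the helper names (`example_4_1`, `is_duval_extension`, `period`) are ours and serve only as an illustration of the definitions, not as an efficient test.

```python
def is_unbordered(x):
    """True if the nonempty word x has no border shorter than itself."""
    return len(x) > 0 and all(x[:k] != x[-k:] for k in range(1, len(x)))


def period(x):
    """Smallest p >= 1 such that x[i] == x[i + p] wherever both exist."""
    n = len(x)
    return next(p for p in range(1, n + 1)
                if all(x[i] == x[i + p] for i in range(n - p)))


def is_duval_extension(w, u):
    """w unbordered, u nonempty, every factor of wu longer than |w| bordered."""
    x = w + u
    if not u or not is_unbordered(w):
        return False
    return not any(is_unbordered(x[i:j])
                   for i in range(len(x))
                   for j in range(i + len(w) + 1, len(x) + 1))


def example_4_1(n, m):
    """Return (w, u) with w = a^n b a^(n+m) bb and u = a^(n+m) b a^n."""
    w = "a" * n + "b" + "a" * (n + m) + "bb"
    u = "a" * (n + m) + "b" + "a" * n
    return w, u
```

For n = m = 1 this reproduces w = abaabb and u = aaba from the introduction, where d(wu) = 7 > |w| = 6 and |u| = |w| − 2.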

In general, Duval [10] proved that we have d(w) = μ(w), for any word w, if
|w| ≥ 4μ(w) − 6. Duval also noted that already |w| ≥ 3μ(w) implies d(w) = μ(w),
provided his conjecture holds. Corollary 1.3 follows from Theorem 1.2.

Corollary 1.3. If |w| ≥ 3μ(w) − 2 then d(w) = μ(w).

However, this bound is unlikely to be tight. The best example for a large
bound that we could find is taken from [?].

Example 4.2. Let

w = a^n b a^{n+1} b a^n b a^{n+2} b a^n b a^{n+1} b a^n.

We have |w| = 7n + 10 and μ(w) = 3n + 6 and d(w) = 4n + 7.
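
The values stated in Example 4.2 can be recomputed for small n. The following sketch constructs w and evaluates |w|, μ(w), and d(w) by exhaustive search; the function names are hypothetical and the search is only practical for short words.

```python
def is_unbordered(x):
    """True if the nonempty word x has no border shorter than itself."""
    return len(x) > 0 and all(x[:k] != x[-k:] for k in range(1, len(x)))


def period(x):
    """Smallest p >= 1 such that x[i] == x[i + p] wherever both exist."""
    n = len(x)
    return next(p for p in range(1, n + 1)
                if all(x[i] == x[i + p] for i in range(n - p)))


def mu(x):
    """Maximum length of an unbordered factor of x (exhaustive search)."""
    n = len(x)
    return max(j - i
               for i in range(n)
               for j in range(i + 1, n + 1)
               if is_unbordered(x[i:j]))


def example_4_2(n):
    """w = a^n b a^(n+1) b a^n b a^(n+2) b a^n b a^(n+1) b a^n."""
    blocks = [n, n + 1, n, n + 2, n, n + 1, n]
    return "b".join("a" * k for k in blocks)
```

For n = 1 the word is abaababaaababaaba of length 17, with μ(w) = 9 and d(w) = 11, in agreement with the formulas 7n + 10, 3n + 6, and 4n + 7.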

So, we have that the precise bound for the length of a word that implies
d(w) = μ(w) is larger than 7/3 μ(w) − 4 and not larger than 3μ(w) − 2. The
characterization of the precise bound of the length of a word as a function of its
longest unbordered factor is still an open problem.

5 Conclusions 

In this paper we have given an affirmative answer to a long-standing conjecture
[10] by proving that a Duval extension wu of w longer than 2|w| − 2 is trivial.
This bound is tight and also gives a new bound on the relation between the length
of an arbitrary word w and its longest unbordered factors μ(w), namely that
|w| ≥ 3μ(w) − 2 implies d(w) = μ(w) as conjectured (more weakly) in [1]. We
believe that the precise bound can be achieved with methods similar to those
presented in this paper.